
fix: harden NAR hash validation and improve URL parsing [backport #842]#858

Merged
kalbasit merged 1 commit into release-0.8 from backport-842-to-release-0.8 on Feb 12, 2026
Conversation

@kalbasit (Owner)

This change strengthens the validation of NAR hashes by enforcing a strict Nix32 format (52 characters) and optional narinfo hash prefix. The ParseURL function has been rewritten to be more robust, separating the hash and compression components more reliably and validating the hash before further processing.

Key improvements:

  • Defined NormalizedHashPattern and HashPattern for precise hash validation.
  • Enhanced ParseURL to correctly handle various NAR URL formats including those with query parameters.
  • Improved the Normalize method in the URL struct to better handle narinfo hash prefixes and sanitize the hash against path traversal.
  • Updated all relevant tests to use valid Nix32 hashes and removed redundant test cases.
  • Ensured consistent URL construction in JoinURL by correctly handling paths and query parameters.

This hardening prevents potential security issues related to invalid or malicious NAR URLs and ensures consistent behavior across the application.

(cherry picked from commit 2bb8649)

@dosubot added the size:XL label (This PR changes 500-999 lines, ignoring generated files) on Feb 12, 2026
@gemini-code-assist (Contributor)

Summary of Changes

Hello @kalbasit, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the reliability and security of NAR hash validation and URL parsing within the system. By enforcing a strict Nix32 hash format and refining the URL parsing logic, it prevents potential vulnerabilities and ensures consistent handling of NAR resources. Additionally, it improves the testability and robustness of S3 storage interactions by allowing custom HTTP transports and adding extensive error path testing.

Highlights

  • NAR Hash Validation Hardening: Strengthened NAR hash validation by enforcing the strict Nix32 format (52 characters) and allowing for an optional narinfo hash prefix, improving security and consistency.
  • Improved URL Parsing: Rewrote the ParseURL function for enhanced robustness, ensuring reliable separation of hash and compression components and implementing early hash validation before further processing.
  • New Hash Patterns: Introduced NormalizedHashPattern and HashPattern constants to precisely define and validate Nix32 hashes.
  • Updated Test Cases: Updated test data and cases across pkg/cache and pkg/nar to align with the new 52-character Nix32 hash format and validate the improved URL parsing.
  • S3 Transport Configuration: Enhanced S3 storage configuration by adding an http.RoundTripper option, enabling custom HTTP transport for S3 requests and improving testability.
  • Comprehensive S3 Error Path Testing: Expanded S3 storage tests with extensive error path scenarios for various operations like bucket access, listing, getting, putting, and deleting NAR/narinfo objects, utilizing a mock HTTP transport.
Changelog
  • pkg/cache/cache_test.go
    • Updated collisionHash and NarInfoText to use 52-character Nix32 hashes in test data.
  • pkg/nar/filepath_test.go
    • Modified test cases to use a 52-character Nix32 hash for NAR file path generation.
  • pkg/nar/hash.go
    • Added NormalizedHashPattern and HashPattern constants for Nix32 hash validation.
    • Updated narHashRegexp to use the new HashPattern.
    • Imported github.com/kalbasit/ncps/pkg/narinfo.
  • pkg/nar/url.go
    • Replaced narRegexp with hashValidationRegexp using the new HashPattern.
    • Rewrote ParseURL to extract hash, compression, and query parameters more robustly, including early hash validation.
    • Corrected the argument order for assert.ErrorIs in TestParseURL.
  • pkg/storage/s3/s3.go
    • Imported net/http.
    • Added Transport http.RoundTripper field to Config struct.
    • Configured the MinIO client to use the provided Transport.
  • pkg/storage/s3/s3_test.go
    • Imported errors, fmt, io, net/http.
    • Defined roundTripperFunc type and helper functions (s3NotFoundResponse, s3OKResponse) for mocking HTTP responses.
    • Added new test suites (TestBucketAccess_ErrorPaths, TestWalkNarInfos_ErrorPaths, TestWalkNarInfos_Structure, TestHasNarInfo_ErrorPaths, TestDeleteNarInfo_ErrorPaths, TestNarInfo_ErrorPaths, TestHasNar_ErrorPaths, TestPutNar_ErrorPaths, TestNar_ErrorPaths) to cover various error scenarios for S3 operations using the mock transport.
Activity
  • No human activity (comments, reviews, progress updates) was provided in the context.

@dosubot added the bug (Something isn't working) and go (Pull requests that update go code) labels on Feb 12, 2026
@dosubot commented Feb 12, 2026

Related Documentation

1 document may need updating based on files changed in this PR:

kalbasit's Space

S3-Compatible Storage Backend
Suggested changes:
@@ -1,9 +1,7 @@
 ## S3-Compatible Storage Backend: Design & Implementation
-
 The S3-compatible storage backend for ncps provides a scalable, cloud-native alternative to local filesystem storage. It is implemented in the `pkg/storage/s3` package and is a drop-in replacement for local storage, supporting AWS S3, MinIO, and other S3-compatible services. The backend implements the same storage interface, allowing seamless integration and migration between storage types. [Source](https://github.com/kalbasit/ncps/blob/b7653d4303670bb88fd34b8bf82ed3aade09885d/pkg/storage/s3/README.md#L3-L183)
 
 ### Configuration Options
-
 Configuration is managed via CLI flags, environment variables, or configuration files. The core options are:
 
 | Option                        | Env Variable                  | Required | Description                                 |
@@ -18,7 +16,6 @@
 The endpoint should be provided **without** the URL scheme in code (e.g., `"localhost:9000"`), but CLI flags may include the scheme (`http://` or `https://`). The scheme takes precedence over the `use-ssl` flag. [Source](https://github.com/kalbasit/ncps/blob/b7653d4303670bb88fd34b8bf82ed3aade09885d/README.md#L46-L782)
 
 #### Example: AWS S3
-
 ```bash
 ncps serve \
   --cache-hostname=ncps.example.com \
@@ -33,7 +30,6 @@
 ```
 
 #### Example: MinIO
-
 ```bash
 ncps serve \
   --cache-hostname=ncps.example.com \
@@ -48,7 +44,6 @@
 ```
 
 ### Supported Providers
-
 The backend supports AWS S3, MinIO, and any S3-compatible service (such as Ceph). Provider-specific notes:
 
 - **AWS S3**: Requires region and SSL; endpoint is typically `s3.<region>.amazonaws.com`.
@@ -56,7 +51,6 @@
 - **Other S3-Compatible**: Use the appropriate endpoint and credentials for your provider.
 
 ### Integration with the Storage Interface
-
 The S3 backend implements the `storage.Store` interface, providing methods for CRUD operations on secret keys, narinfo files, and nar objects. This allows it to be used interchangeably with the local storage backend. Initialization is performed via:
 
 ```go
@@ -81,7 +75,6 @@
 [Source](https://github.com/kalbasit/ncps/blob/b7653d4303670bb88fd34b8bf82ed3aade09885d/pkg/storage/s3/example.go#L16-L81)
 
 ### Path Sharding
-
 To prevent performance bottlenecks from too many files in a single directory, the backend shards objects using the first one and two characters of the object's hash. For example, a narinfo file with hash `abc123` is stored at `store/narinfo/a/ab/abc123.narinfo`. This is implemented via helper functions:
 
 ```go
@@ -94,7 +87,6 @@
 [Source](https://github.com/kalbasit/ncps/blob/b7653d4303670bb88fd34b8bf82ed3aade09885d/pkg/helper/filepath.go#L5-L38)
 
 #### Storage Structure
-
 ```mermaid
 graph TD
     A["bucket"]
@@ -107,11 +99,9 @@
 ```
 
 ### Error Handling
-
 The backend translates S3-specific errors to project-level errors. For example, S3 `NoSuchKey` errors become `storage.ErrNotFound`, and attempts to overwrite existing objects return `storage.ErrAlreadyExists`. Configuration is validated before initialization, and bucket existence is checked. All operations include OpenTelemetry tracing for observability. [Source](https://github.com/kalbasit/ncps/blob/b7653d4303670bb88fd34b8bf82ed3aade09885d/pkg/storage/s3/s3.go#L1-L554)
 
 ### Migration from Local Storage
-
 To migrate from local storage to S3:
 
 1. Create an S3 bucket or MinIO instance.
@@ -139,7 +129,6 @@
 ### Usage Examples
 
 #### AWS S3 (Go)
-
 ```go
 cfg := s3.Config{
     Bucket:          "my-nix-cache",
@@ -153,7 +142,6 @@
 ```
 
 #### MinIO (Go)
-
 ```go
 cfg := s3.Config{
     Bucket:          "my-nix-cache",
@@ -166,7 +154,6 @@
 ```
 
 #### Store Operations
-
 ```go
 // Store a secret key
 secretKey, err := signature.LoadSecretKey("your-secret-key-content")
@@ -181,7 +168,6 @@
 [Source](https://github.com/kalbasit/ncps/blob/b7653d4303670bb88fd34b8bf82ed3aade09885d/pkg/storage/s3/example.go#L16-L81)
 
 ### Development Setup with MinIO
-
 To develop and test with MinIO:
 
 1. Start MinIO locally (the dev script manages this for you):
@@ -198,8 +184,51 @@
 
 The script auto-restarts on code changes and configures MinIO for local development. The S3 backend will use the MinIO instance at `localhost:9000` with default credentials (`minioadmin`/`minioadmin`). [Source](https://github.com/kalbasit/ncps/blob/b7653d4303670bb88fd34b8bf82ed3aade09885d/README.md#L46-L782)
 
+### Kubernetes Development Environment (Kind)
+You can also develop and test with MinIO running inside a local Kubernetes cluster using [Kind](https://kind.sigs.k8s.io/) and the provided setup script:
+
+1. Start the Kubernetes development environment:
+
+   ```bash
+   ./dev-scripts/k8s-cluster.sh
+   ```
+
+   This script will:
+   - Create a Kind cluster (if not already running)
+   - Deploy MinIO with a pre-created bucket (`ncps-bucket`) and access keys
+   - Deploy PostgreSQL, MariaDB, and Redis instances
+   - Output connection details for all services
+
+2. To access MinIO from your host, port-forward the service:
+
+   ```bash
+   kubectl port-forward -n minio svc/minio 9000:9000
+   ```
+
+3. Use the following credentials and endpoint for your S3 backend configuration:
+
+   | Option        | Value             |
+   |--------------|-------------------|
+   | Endpoint     | http://localhost:9000 |
+   | Bucket       | ncps-bucket       |
+   | Access Key   | ncps-access-key   |
+   | Secret Key   | ncps-secret-key   |
+   | Use SSL      | false             |
+
+   Example CLI usage:
+
+   ```bash
+   ncps serve \
+     --cache-storage-s3-bucket=ncps-bucket \
+     --cache-storage-s3-endpoint=http://localhost:9000 \
+     --cache-storage-s3-access-key-id=ncps-access-key \
+     --cache-storage-s3-secret-access-key=ncps-secret-key \
+     --cache-storage-s3-use-ssl=false
+   ```
+
+The script ensures MinIO is ready and pre-configured for development, so you do not need to manually create buckets or credentials. This environment is ideal for testing NCPS against S3-compatible storage and database backends in a reproducible local cluster.
+
 ### Security Considerations
-
 Store credentials securely (prefer environment variables or IAM roles for AWS). Use SSL/TLS in production. For AWS, restrict access with IAM policies and consider temporary credentials. Enable bucket versioning and set appropriate bucket policies for access control. [Source](https://github.com/kalbasit/ncps/blob/b7653d4303670bb88fd34b8bf82ed3aade09885d/pkg/storage/s3/README.md#L3-L183)
 
 ---
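The path-sharding scheme described in the suggested documentation can be sketched as below. `narInfoPath` is an illustrative stand-in for the project's helper in `pkg/helper`, assuming only what the docs state: objects are sharded by the first one and first two characters of their hash.

```go
package main

import (
	"fmt"
	"path"
)

// narInfoPath builds the sharded object key for a narinfo file: the
// first character and first two characters of the hash become
// intermediate directories, keeping any single prefix from holding
// too many objects.
func narInfoPath(hash string) string {
	return path.Join("store", "narinfo", hash[:1], hash[:2], hash+".narinfo")
}

func main() {
	fmt.Println(narInfoPath("abc123")) // store/narinfo/a/ab/abc123.narinfo
}
```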


@gemini-code-assist (Contributor) left a comment

Code Review

This pull request is a great improvement. It hardens NAR hash validation, making the system more secure. The refactoring of ParseURL is a significant enhancement in terms of robustness and readability. Additionally, the mock transport for the S3 storage backend and the extensive new tests greatly improve coverage and code quality. I found one issue: the use of filepath.Base for URL parsing, which can behave differently across operating systems.

```go
pathPart, rawQuery, _ := strings.Cut(u, "?")

// Get the filename (last component of the path)
filename := filepath.Base(pathPart)
```
Severity: high

The filepath package is intended for operating system-specific file paths, which can lead to incorrect behavior when parsing URLs on different platforms (e.g., Windows vs. Linux). For handling URL paths, the path package should be used to ensure consistent behavior.

For example, on Windows, filepath.Base("nar/somehash.nar") would return "nar/somehash.nar", causing the parsing to fail. Using path.Base will correctly return "somehash.nar" on all systems.

Please import the path package and use path.Base here.

Suggested change:

```diff
- filename := filepath.Base(pathPart)
+ filename := path.Base(pathPart)
```

@kalbasit merged commit 025e176 into release-0.8 on Feb 12, 2026 (14 checks passed)
@kalbasit deleted the backport-842-to-release-0.8 branch on February 12, 2026 at 03:48